High-performance AI infrastructure
Cloud development made frictionless
Run generative AI models, large-scale batch jobs, job queues, and much more. Bring your own code — we run the infrastructure.
Iterate at the speed of thought
Make code changes and watch your app rebuild instantly. Never write a single line of YAML again.
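In practice, that loop is a decorated Python function and nothing else. A minimal sketch using Modal's documented Python API (the app and function names here are illustrative):

```python
import modal

app = modal.App("hello-modal")  # hypothetical app name

@app.function()
def square(x: int) -> int:
    return x * x

@app.local_entrypoint()
def main():
    # square() executes remotely in a Modal container.
    print(square.remote(7))
```

Running this under `modal serve` picks up local edits and rebuilds the app on the fly; `modal run` executes it once. No YAML involved.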
Built for large-scale workloads
Engineered in Rust, our custom container stack allows you to scale to hundreds of GPUs and then back down to zero in seconds. Pay only while it's running.
Use Cases
Generative AI inference that scales with you
Fast cold boots
Load gigabytes of weights in seconds with our optimized container file system.
Bring your own code
Deploy anything from custom models to popular frameworks.
Seamless autoscaling
Handle bursty and unpredictable load by scaling to thousands of GPUs and back down to zero.
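Putting the three points above together, a typical inference deployment loads weights once per container at cold start and lets Modal scale containers with traffic. A hedged sketch (the model and framework are illustrative, not Modal-prescribed):

```python
import modal

image = modal.Image.debian_slim().pip_install("torch", "transformers")
app = modal.App("inference-demo", image=image)

@app.cls(gpu="A100")
class Model:
    @modal.enter()  # runs once per container, at cold start
    def load(self):
        from transformers import pipeline
        self.pipe = pipeline("text-generation", model="gpt2", device=0)

    @modal.method()
    def generate(self, prompt: str) -> str:
        return self.pipe(prompt)[0]["generated_text"]

@app.local_entrypoint()
def main():
    # Calls land on autoscaled containers; idle containers scale to zero.
    print(Model().generate.remote("Serverless GPUs are"))
```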
Fine-tuning and training without managing infrastructure
Start training immediately
Provision Nvidia A100 and H100 GPUs in seconds. Your drivers and custom packages are already there.
Never wait in line
Run as many experiments as you need to, in parallel. Stop paying for idle GPUs when you’re done.
Cloud storage
Mount weights and data in distributed volumes, then access them wherever they’re needed.
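These three pieces compose naturally: request a GPU, mount a volume, fan out a sweep. A sketch under stated assumptions (the volume name, paths, and hyperparameters are illustrative):

```python
import modal

app = modal.App("finetune-demo")
vol = modal.Volume.from_name("training-data", create_if_missing=True)

@app.function(gpu="H100", volumes={"/data": vol}, timeout=60 * 60)
def train(lr: float) -> str:
    ckpt = f"/data/checkpoints/lr-{lr}.pt"
    # ... read the dataset from /data and run the training loop here ...
    vol.commit()  # persist checkpoint writes for other functions to read
    return ckpt

@app.local_entrypoint()
def main():
    # Run the sweep in parallel; each experiment gets its own GPU container.
    for ckpt in train.map([1e-4, 3e-4, 1e-3]):
        print("wrote", ckpt)
```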
Batch processing optimized for high-volume workloads
Supercomputing scale
Serverless, but for high-performance compute. Run workloads on massive amounts of CPU and memory.
Serverless pricing
Pay only for resources consumed, by the second, as you spin up containers.
Powerful compute primitives
Simple fan-out parallelism that scales to thousands of containers, with a single line of Python.
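That single line is `.map()`. A minimal sketch of the fan-out pattern (the resource figures and workload are illustrative):

```python
import modal

app = modal.App("batch-demo")

@app.function(cpu=2.0, memory=4096)  # per-container CPU cores and MiB of RAM
def process(record: str) -> int:
    return len(record)

@app.local_entrypoint()
def main():
    records = [f"record-{i}" for i in range(10_000)]
    # The one-line fan-out: Modal spreads these calls across containers.
    print(sum(process.map(records)))
```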
Features
Flexible Environments
Bring your own image or build one in Python, scale resources as needed, and leverage state-of-the-art GPUs like H100s & A100s for high-performance computing.
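For example, an environment can be declared right next to the code it runs. A sketch assuming Modal's image-builder API (the package list and registry image are illustrative):

```python
import modal

# Build the environment in Python...
image = (
    modal.Image.debian_slim(python_version="3.11")
    .pip_install("torch", "transformers")
)
# ...or bring your own image from a registry, e.g.:
# image = modal.Image.from_registry("nvcr.io/nvidia/pytorch:24.05-py3")

app = modal.App("env-demo", image=image)

@app.function(gpu="H100")
def which_gpu() -> str:
    import torch  # available because the image installed it
    return torch.cuda.get_device_name(0)
```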
Seamless Integrations
Export function logs to Datadog or any OpenTelemetry-compatible provider, and easily mount cloud storage from major providers (S3, R2, etc.).
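Mounting a bucket then looks like attaching any other volume. A hedged sketch (the bucket and secret names are placeholders you would create yourself):

```python
import modal

app = modal.App("bucket-demo")

bucket = modal.CloudBucketMount(
    "my-training-bucket",                              # hypothetical S3 bucket
    secret=modal.Secret.from_name("aws-credentials"),  # holds AWS keys
)

@app.function(volumes={"/bucket": bucket})
def peek() -> list[str]:
    import os
    return os.listdir("/bucket")  # read the bucket like a local directory
```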
Data Storage
Manage data effortlessly with storage solutions (network volumes, key-value stores and queues). Provision storage types and interact with them using familiar Python syntax.
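For instance, a persistent dict and queue can be created by name and used with ordinary Python operations (the object names here are illustrative):

```python
import modal

app = modal.App("storage-demo")

results = modal.Dict.from_name("results", create_if_missing=True)
jobs = modal.Queue.from_name("jobs", create_if_missing=True)

@app.function()
def worker():
    item = jobs.get()             # blocking pop from the shared queue
    results[item] = item.upper()  # plain dict-style assignment

@app.local_entrypoint()
def main():
    jobs.put("hello")
    worker.remote()
    print(results["hello"])  # -> "HELLO"
```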
Job Scheduling
Take control of your workloads with powerful scheduling. Set up cron jobs, retries, and timeouts, or use batching to optimize resource usage.
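A sketch of what that looks like in code (the schedule, retry count, and timeout are illustrative values):

```python
import modal

app = modal.App("cron-demo")

@app.function(
    schedule=modal.Cron("0 8 * * *"),  # run daily at 08:00 UTC
    retries=3,                         # retry transient failures
    timeout=10 * 60,                   # kill runs after 10 minutes
)
def nightly_job():
    print("refreshing data...")
```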
Web Endpoints
Deploy and manage web services with ease. Create custom domains, set up streaming and websockets, and serve functions as secure HTTPS endpoints.
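As a sketch, any ASGI app can be served this way (FastAPI here, purely as an example); deploying it yields a managed HTTPS URL:

```python
import modal

image = modal.Image.debian_slim().pip_install("fastapi[standard]")
app = modal.App("web-demo", image=image)

@app.function()
@modal.asgi_app()  # exposes the returned app at a generated HTTPS URL
def api():
    from fastapi import FastAPI

    web = FastAPI()

    @web.get("/hello")
    def hello():
        return {"message": "served over HTTPS"}

    return web
```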
Built-In Debugging
Troubleshoot efficiently with built-in debugging tools. Use `modal shell` for interactive debugging and set breakpoints to pinpoint issues quickly.
Pay only while your code is running
Compute costs
GPU Tasks
Nvidia H100: $0.001267 / sec
Nvidia A100, 80 GB: $0.000944 / sec
Nvidia A100, 40 GB: $0.000772 / sec
Nvidia L40S: $0.000542 / sec
Nvidia A10G: $0.000306 / sec
Nvidia L4: $0.000222 / sec
Nvidia T4: $0.000164 / sec

CPU
Physical core (2 vCPU equivalent): $0.000038 / core / sec
*minimum of 0.125 cores per container

Memory
$0.00000667 / GiB / sec
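Because billing is per second, a cost estimate is just rate x seconds x containers. For example, from the rates above:

```python
# Back-of-envelope checks against the H100 rate above.
h100_per_sec = 0.001267  # dollars per second
print(f"1 H100-hour: ${h100_per_sec * 3600:.2f}")                # $4.56
print(f"100 H100s for 10 min: ${h100_per_sec * 600 * 100:.2f}")  # $76.02
```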
Security and governance for teams of all scales
Built with Modal
“Modal makes it easy to write code that runs on 100s of GPUs in parallel, transcribing podcasts in a fraction of the time.”
Mike Cohen, Head of Data
“Tasks that would have taken days to complete take minutes instead. We’ve saved thousands of dollars deploying LLMs on Modal.”
Rahul Sengottuvelu, Head of Applied AI
“The beauty of Modal is that all you need to know is that you can scale your function calls in the cloud with a few lines of Python.”
Georg Kucsko, Co-founder and CTO
Community
If you're building AI stuff with Python and haven't tried @modal_labs you are missing out big time
@modal_labs continues to be magical... 10 minutes of effort and the `joblib`-based parallelism I use to test on my local machine can trivially scale out on the cloud. Makes life so easy!
This tool is awesome. So empowering to have your infra needs met with just a couple decorators. Good people, too!
Modal has the most magical onboarding I've ever seen and it's not even close. And Erik's walkthrough of how they approached it is a Masterclass.
special shout out to @modal_labs and @_hex_tech for providing the crucial infrastructure to run this! Modal is the coolest tool I’ve tried in a really long time— cannot say enough good things.
I use @modal_labs because it brings me joy. There isn't much more to it.
I have tried @modal_labs and am now officially Modal-pilled. Great work @bernhardsson and team. Every hyperscaler should be trying this out and immediately pivoting their compute teams' roadmaps to match this DX.
I've realized @modal_labs is actually a great fit for ML training pipelines. If you're running model-based evals, why not just call a serverless Modal function and have it evaluate your model on a separate worker GPU? This makes evaluation during training really easy.
Bullish on @modal_labs - Great Docs + Examples - Healthy Free Plan ($30 free compute / month) - Never have to worry about infra / just Python
@modal_labs has got a bunch of stuff just worked out. This should be how you deploy Python apps. wow
If you are still using AWS Lambda instead of @modal_labs you're not moving fast enough
Recently built an app on Lambda and just started to use @modal_labs, the difference is insane! Modal is amazing, virtually no cold start time, onboarding experience is great 🚀
Probably one of the best pieces of software I'm using this year: modal.com
feels weird at this point to use anything other than @modal_labs for this — absolutely the GOAT of dynamic sandboxes
Nothing beats @modal_labs when it comes to deploying a quick POC
Late to the party, but finally playing with @modal_labs to run some backend jobs. DX is sooo nice (compared to Docker, Cloud Run, Lambda, etc). Just decorate a Python function and deploy. And it's fast! Love it.